Abstract

Background: Gene dosage change is a mild perturbation that is a valuable tool for pathway reconstruction in 
Drosophila. While it is often assumed that reducing gene dose by half leads to two-fold less expression, there is 
partial autosomal dosage compensation in Drosophila, which may be mediated by feedback or buffering in 
expression networks. 

Results: We profiled expression in engineered flies where gene dose was reduced from two to one. While 
expression of most one-dose genes was reduced, the gene-specific dose responses were heterogeneous. 
Expression of two-dose genes that are first-degree neighbors of one-dose genes in novel network models also 
changed and the directionality of change depended on the response of one-dose genes. 

Conclusions: Our data indicate that expression perturbation propagates in network space. Autosomal 
compensation, or the lack thereof, is a gene-specific response, largely mediated by interactions with the rest of the 
transcriptome. 
Background

Systematic evaluation of gene dose in segmental aneu- 
ploids shows that dose changes in the majority of the 
Drosophila genome are compatible with life [1,2], but if 
there are enough changes in dose, regardless of the par- 
ticular genes, viability is greatly reduced [2]. This sug- 
gests that gene dose changes have small additive effects 
on viability in Drosophila, which may be analogous to 
the situation in humans, where small regions of segmental
aneuploidy are associated with subtle adult phenotypes
(for example, disease) and large departures from ploidy 
result in fetal death [3-5]. The small effect of gene dose, 
and the significant additive effects when there are 
enough of these changes, suggest that large departures 
from gene balance collapse genetic networks. 
Understanding the effect of copy number on gene 
expression is a prerequisite for systematic study of gene 
dose as a network attribute. 

While there are clear dose effects on viability in seg- 
mental aneuploids [2] and in dominant genetic interac- 
tions in Drosophila (for example, [6]), the effect of copy 
number may be less than implied by the gene dose per 
se. One-dose genes in flies heterozygous for deficiencies 
(deletions removing multiple loci) show average expres- 
sion values less than two-fold reduced [7-9]. Expression 
also shows a sublinear relationship to gene dose in 
highly aneuploid Drosophila tissue culture cells [10]. In 
whole Drosophila showing aneuploidy, some genes in 
trisomic regions show compensation, while others do 
not, at both the transcript and protein levels [11,12]. All 
these data indicate that gene dose responses are not 
always a simple reflection of copy number. We do not 
have well-developed models for the important relation- 
ship between gene dose and expression in Drosophila, 
but there are at least two general mechanisms that we 
test here. 

One model for autosomal dosage compensation sug- 
gests that deletions in autosomes are recognized as 
aneuploid segments and partially compensated in a 
fixed-fold manner independent of the specific gene. 
There is strong evidence that extensive chromosome- 
level aneuploidy results in a characteristic stress 
response in both yeast and mouse cells [13]. One can 
imagine, therefore, that a compensatory response to 
aneuploidy would be advantageous for cells (although 
perhaps not organisms, which might rather purge aneu- 
ploid cells). There is good agreement in average autoso- 
mal dosage compensation levels reported in Drosophila 
[8,9,14], which would be expected if a global aneuploid 
recognition/correction system existed. Indeed, there are 
at least two such systems. Wild-type Drosophila are
diploid for two major autosomes and a dot autosome
(chromosome 4), and have either one (males) or two
X chromosomes (females). Much of the work on gene
dosage in Drosophila has focused on the X chromosome,
where a chromatin-remodeling machine (the male-speci- 
fic lethal (MSL) complex) recognizes and decorates the 
X to increase gene expression in males [15] by promot- 
ing transcriptional elongation [16]. However, the small 
fourth chromosome is also recognized by a chromatin 
remodeling machine (Painting of fourth, Pof) to increase 
gene expression [7]. An analogous global mechanism 
could partially and uniformly compensate for segmental 
aneuploid regions that arise by mutation on the remain- 
ing two major autosomes. If such a system exists, then 
expression of a common set of genes encoding this 
machinery would be expected to increase in segmental 
aneuploid Drosophila, regardless of the particular loca- 
tion of the aneuploid segment. Even in the case of these 
remodeling systems the relationship between dose and 
expression is not simple. There is an X chromosome-specific
compensation system in Drosophila that acts in
the soma during embryogenesis, but not in the germline.
Some genes on the X chromosome in males show
dosage compensation prior to the activation of the prin- 
cipal dosage compensation system in the soma [17], and 
X chromosome genes in the mitotic male germline, 
where the somatic dosage compensation system does 
not operate, are tightly dosage compensated [9,18]. 

There are other possible dosage compensation mechan- 
isms. A second model suggests that feedback mechanisms 
[19] and the dampening of dose effects due to the kinetic 
properties of flux through networks [20] result in partial 
dosage compensation depending on the specific gene with 
a dose change. This network model is supported by the 
fact that gene dose manipulation is a powerful pathway 
reconstruction tool in Drosophila, where deficiencies 
result in a sensitized genetic background for discovering 
new pathway members [21,22]. These studies strongly sug- 
gest that gene dose reductions for individual loci result in 
reduced gene activity and a subtle propagation of perturbations
into regulatory networks. In classical genetic
terms, this suggests that many genes may have subliminal
haplo-insufficient properties, resulting in synthetic or 
background-dependent phenotypes only when nearby 
gene activities in the pathway are suboptimal. This gene- 
specific response hypothesis makes three clear predictions: 
1) genes should show individual characteristic expression 
responses to reduction in dose; 2) these responses should 
occur in the context of the gene expression network in 
which they are embedded; and 3) expression deviations 
from genes with reduced dose should propagate into the 
expression network. 

Our work on gene expression in a series of Drosophila 
deficiencies analyzed in the context of different network 
models indicates that gene interactions play a large role 
in autosomal dose effects and dosage compensation. We 
suggest that studies in model organisms, with a more 
controlled genetic background and environment, will 
help us parse out the complexities of gene dose effects 
and interactions among large sets of genes that make 
small contributions to overt morphological or physiolo- 
gical phenotypes in development and disease. 

Results 

Drosophila lines with reduced gene dose 

We took advantage of the Drosophila model system by 
measuring genome-wide mRNA expression in engi- 
neered autosomal deficiency lines (Df/+) from the Eur- 
opean Drosophila deletion collection (DrosDel) project 
[22,23]. The DrosDel collection offered a key experi- 
mental advantage, as all strains are from the same origi- 
nal stock, minimizing genetic background outside of the 
engineered deletion. We selected 21 DrosDel deficiency 
lines from chromosome arm 2L to survey mRNA 
responses to gene dose in adult flies (Figure 1). The test 
set allowed us to look at one-dose genes in five regions 
with multiple deficiencies, so that we could explore the 
question of whether compensation is a property of indi- 
vidual genes or particular deficiencies. The Dfs removed 
a variable number of genes and were scattered along the 
length of the chromosome arm. 

While the engineered deletions we used have mole- 
cularly defined breakpoints, spontaneous rearrange- 
ments do occur. Additionally, Drosophila has many 
tissues with variably endoreplicated genomes [24], 
which might provide a corrective amplification. To 
directly assess gene dose in the Df/+ flies, we per- 
formed DNA-sequencing (DNA-Seq, average sequen- 
cing depth 3.2x) on adult females and males of all 21 
lines and the parental line. We aligned reads genome-wide
to confirm genotypes (Table 1; Additional file 1), to 
test for selective endoreplication, and to detect any 
novel structural rearrangements. While we did observe 
known selective endoreplication events at the chorion 
loci in females, we found no evidence to support the 
idea that selective amplification was part of the dosage
compensation response. DNA-Seq coverage of wild-type
autosomes was two-fold higher (standard deviation
(SD) = 0.2) than in the engineered deletion
regions, indicating that all Dfs reduce gene dose by
half. In one case (Df(2L)ED478), we found an additional
uncharacterized deficiency, which we named
Df(2L)EDHsp60c, but otherwise we detected no overt
novel rearrangements elsewhere in the genomes, indicating
that dose had not deviated in the time following
creation of the original engineered deletions.

Figure 1 Lines used. Positional map of the start and stop positions (black bars) for deficiencies (Df(2L)EDs, on left) profiled along chromosome
2L (gray bars) and the number of coding genes (on right) removed in full or part. Each line had a single deficiency region with the exception of
one line (asterisk) containing Df(2L)ED478 as well as the de novo Df(2L)Hsp60c. Overlapping deficient regions are shown (dashed lines, open bars).
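
The coverage comparison used to confirm gene dose can be illustrated with a minimal sketch. It assumes per-window read depths are already available for windows inside and outside a deletion; the function name and the depth values are hypothetical and are not data from this study.

```python
from statistics import mean

def dose_ratio(coverage_inside, coverage_outside):
    """Mean depth on two-dose sequence divided by mean depth inside the Df interval."""
    return mean(coverage_outside) / mean(coverage_inside)

# Hypothetical per-window read depths (illustrative values only).
inside = [1.5, 1.7, 1.6, 1.6]    # windows within the deleted interval of a Df/+ fly
outside = [3.1, 3.3, 3.2, 3.2]   # windows on flanking two-dose autosomal sequence

ratio = dose_ratio(inside, outside)
print(f"outside/inside coverage ratio: {ratio:.2f}")
```

A ratio near 2 is what a heterozygous deletion predicts; a ratio closer to 1 inside a nominally deleted interval would instead point to amplification or a mis-assigned genotype.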

We performed expression experiments on both 
females and males because, in a network model for dose 
effects and compensation, the responses to gene dose 
should differ depending on expression context. There is 
a long history of expression profiling between the sexes 
that has clearly shown that females and males have sub- 
stantially different expression networks, due in large 
measure to the gonads and particularly the germ cells 
[9,25-28]. Because of the large gonad size relative to the 
rest of the body, these sex-biased expression profiles are 
quite evident in whole adults. While there are advan- 
tages to examining expression networks by cell type, tis- 
sue or organ, we were concerned about introducing 
dissection as a variable in the experiments; therefore, we 
performed all work on whole females and males. 



Expression of one-dose genes in Df/+ flies 

To determine the overall pattern of dose responses, we 
pooled expression measurements for all 478 one-dose 
genes in the entire set of deficiencies and compared 
expression to a wild-type reference built from the same set 
of experiments (Additional file 2). We then used resam- 
pling methods to compare the expression of similar num- 
bers of one-dose and two-dose genes. Because expression 
of genes physically linked on chromosomes is often correlated
[29], we sampled contiguous blocks of two-dose
genes along the genome to obviate any effects due to the 
non-random arrangement of genes. As expected, we 
observed lower expression from the one-dose genes (Figure 
2a,b). Females and males showed similar overall responses 
to this copy number change with a mean 1.6-fold reduction 
in expression. This was less expression change than the 
two-fold reduction predicted if expression strictly followed 
gene dose, and is in line with previous studies [7]. 
The distribution of responses around means could be 
due to biological and technical noise layered over a 1.6- 
fold partial-compensation system that the cellular 
machinery applied uniformly to all one-copy genes, or 
gene-specific responses due to feedback. If there were a 
general fixed-fold aneuploid response system, then error 
and noise should be normally distributed around the 
central tendency of 1.6-fold expression compensation. 
Table 1 DNA-Seq measurements for aneuploid segments

Figure 2 Expression of one-dose genes. (a,b) Histograms of expression values for one-dose genes and the same number of two-dose genes 
generated by resampling expression values (2,000x). This corrects for the large sample size differences between these two dose classes. 
Resampling of two-dose genes was restricted to contiguous regions corresponding in gene content to the extent of gene deletion in Dfs to 
control for nonrandom expression values resulting from co-regulation of physically linked genes. The expected value for non-compensation is 
shown (dotted line). Mean expression differences are indicated above each distribution. (c,d) Mean centered distributions of the graphs in (a,b). 
(e) Prevalence of dosage compensation classes (see Materials and methods). (f) Notched boxplots of variance (fold-difference²) calculated for 
sampled (2,000x) one-dose genes due to different Dfs compared to variance of the same genes in a two-dose state. Medians (bar), 95% 
confidence intervals (notch), 25 to 75 percentiles (box), and 1.5 x interquartile range (whiskers) are shown. Wt, wild type. 
of one-dose and two-dose genes was significant (P < 
0.01, Kolmogorov-Smirnov test), indicating that the 
spread in the expression values of one-dose genes was 
not due to measurement error or biological noise in the 
system. The response to reduced dose was heteroge- 
neous in nature. 
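
The resampling and distribution comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the expression values are simulated, the function names are hypothetical, and only the Kolmogorov-Smirnov statistic is computed (a P value would require a statistics library, for example `scipy.stats.ks_2samp`).

```python
import random
from statistics import mean

def sample_contiguous_block(values, block_size, rng):
    """Draw one run of neighboring genes, preserving local co-regulation."""
    start = rng.randrange(len(values) - block_size + 1)
    return values[start:start + block_size]

def mean_center(values):
    """Subtract the mean so that only the shape of the distribution is compared."""
    m = mean(values)
    return [v - m for v in values]

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Simulated log2(Df/+ / reference) values; not data from this study.
rng = random.Random(0)
two_dose = [rng.gauss(0.0, 0.3) for _ in range(5000)]   # listed in genome order
one_dose = [rng.gauss(-0.68, 0.6) for _ in range(40)]   # log2(1/1.6) is about -0.68

# Null distribution of block means from 2,000 contiguous two-dose blocks.
null_means = [mean(sample_contiguous_block(two_dose, len(one_dose), rng))
              for _ in range(2000)]

d = ks_statistic(mean_center(one_dose), mean_center(two_dose))
print(f"mean one-dose log2 ratio: {mean(one_dose):.2f}")
print(f"range of two-dose block means: {min(null_means):.2f} to {max(null_means):.2f}")
print(f"KS statistic on mean-centered values: {d:.3f}")
```

Mean centering removes the expected shift between dose classes, so the statistic reflects only a difference in spread or shape, which is the quantity at issue when asking whether the one-dose response is uniform.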

For analyses we will present later in the manuscript, it 
was useful to classify the dosage compensation 
responses. Genes within the bounds of the models for full
compensation or non-compensation were classified as such.
We classified genes failing both models (P < 0.05) as anti-,
partially or over-compensated based on their position
relative to the two models (see Materials and methods).
Non-compensation and partial compensation classes 
accounted for the most genes, but we also observed 
skewing toward better compensation following classifica- 
tion (Figure 2e). These classifications also show the het- 
erogeneous nature of the dose response, and suggest 
that dosage compensation responses were gene-specific. 
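
The logic of the five classes can be made concrete with a simplified sketch. The study classified genes by statistical tests against the two models (see Materials and methods); here a fixed tolerance on the log2 expression ratio stands in for those tests, so the `tol` parameter and the function name are assumptions made only for illustration.

```python
def classify_compensation(log2_ratio, tol=0.15):
    """Place a one-dose gene relative to the full (0) and non-compensation (-1) models.

    log2_ratio is log2(one-dose expression / two-dose expression).
    tol is a hypothetical stand-in for the model-based significance tests.
    """
    if abs(log2_ratio) <= tol:
        return "fully compensated"
    if abs(log2_ratio + 1.0) <= tol:
        return "non-compensated"
    if log2_ratio < -1.0:
        return "anti-compensated"      # expression falls by more than two-fold
    if log2_ratio > 0.0:
        return "over-compensated"      # expression rises despite reduced dose
    return "partially compensated"     # between the two models

for value in (-1.6, -1.0, -0.68, 0.05, 0.5):
    print(f"{value:+.2f} -> {classify_compensation(value)}")
```

Under this scheme the mean 1.6-fold reduction reported above (log2 ratio of about -0.68) falls in the partially compensated class.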

To more directly test for a gene-specific response, we 
asked if the dosage compensation response of a given 
one-dose gene was significantly different when tested in 
the context of different deficiencies, which all uncover 
the same one-dose gene. This test had the added advan- 
tage of exploring the idea that there are deletion-specific 
compensation levels. Such effects might be mediated by 
changes in the complex three-dimensional structure of 
the nucleus arising from deletions and juxtaposition of 
breakpoints. We observed no significant differences in 
expression when the same one-dose gene was measured 
in the context of different deficiencies. Additionally, we 
analyzed variance in expression among one-dose genes 
compared to those same loci when present in two doses. 
There was no significant change in expression variance 
due to gene dose (Figure 2f). We also found no signifi- 
cant correlation between Df extent (amount of DNA or 
number of contiguous genes removed), or position 
along the chromosome, and compensation class (Addi- 
tional file 3). These data indicate that the one-dose 
response was gene-specific. 

We found a clear correlation (P < 0.01) between 
expression level and compensation state. No compensa- 
tion was more prevalent at high expression levels, while 
compensation was more prevalent at lower expression 
levels (Figure 3a,b). However, it is important to note 
that we observed a range of reproducible responses to 
dose at all expression levels. One has to be particularly 
careful with assessing compensation levels in expression 
experiments, as technical noise at low expression levels 
can falsely suggest compensation. We were rigorous 
with low-level cutoff values (>2 SD above background in 
all 21 lines; see Materials and methods), strongly sug- 
gesting that better observed compensation at low 
expression levels was not due to spurious ratiometric
values due to noise. Importantly, we found support for
the array data by Illumina and SOLiD RNA-Seq on
three of the deficiency lines, suggesting that genes with 
lower expression levels were indeed more fully compen- 
sated (Figure 3c, d). Furthermore, we used sets of 96 
control RNAs of known abundance as external RNA 
spike-ins produced by the External RNA Control Con- 
sortium (ERCC) [30] to calibrate expression ratios over 
a range of expression values in these experiments. Data 
from the SOLiD platform showed linearity between the 
observed and expected abundances and lack of data 
compression (Figure 3e). We have previously demon- 
strated the linearity of Illumina RNA-Seq with these 
controls [31] as well as linearity between RNA-Seq and 
arrays [32]. We concluded that the negative relationship 
between compensation and expression levels was a bio- 
logical phenomenon. 

Gene regulation might explain compensation 
responses of individual genes. For example, genes show- 
ing anti-compensation could be auto-regulatory and the 
loss of one copy might create a downward spiral due to 
loss of positive feedback. It follows that gene-specific 
dosage compensation mediated by network interactions 
should change as the structure of the network and asso- 
ciated gene expression levels changed. 

At the genome-wide level, sex differences in gene 
expression were much more prevalent compared to the 
effect caused by gene dose. Expression profiles showed 
clear signatures of sex, and with the exception of Df 
(2L)ED793/+ females, only very subtle expression dif- 
ferences between lines within a sex (Figure 4a). The 
pervasive effect of sex on gene expression should drive 
the expression of one-dose genes to differing degrees 
in the sexes. If genes with sex-biased expression show 
different dosage compensation responses in females 
and males, this would suggest that compensation was 
network-dependent. To test this, we grouped genes 
detectably expressed in both sexes into those with 
female-, male-, or non-biased expression according to 
a database of sex-biased gene expression [33]. One- 
dose genes with female- or male-biased expression 
(Figure 4b,c) showed much less consistent compensation
between the sexes (ρ = 0.47 and 0.49, respectively)
compared to one-dose genes with non-biased expression
(ρ = 0.75; Figure 4d), suggesting that network
context modulated compensation. 

Expression of two-dose genes in Df/+ flies 

Dosage effects and compensation by interactions within 
networks require perturbation detection by the network, 
which is then followed by feedback correction. In other 
words, perturbation propagation into the wild-type dose 
regions of the genome must precede compensation in 
temporal sequence. While all our observations were on 
steady-state conditions, we asked if there were any signatures
of propagation in our datasets. Propagation
would result in differentially expressed genes outside of
the one-dose regions (two-dose genes, which include
dosage-compensated X chromosome genes in the case
of males). Changes in two-dose gene expression were
extensive. Collectively, the Dfs we used altered the dose 
of approximately 5% of Drosophila genes, but we
observed change in approximately 80% of Drosophila
genes in at least one Df/+ line relative to the parental
w line. Such changes did not appear to be a general- 
ized response to aneuploid stress, as very few two-dose 
genes changed expression in all lines (nine in females, 
seven in males, and none in both sexes). Additionally, 
we observed no obvious ontology commonalities among 
these genes. The absence of a strong stress response 
[13] in our flies may be due to the rather limited number
of genes with reduced gene dose in many of the
deficiencies. 

Our data showed that Df/+ genes compensated in the 
absence of a common two-dose genome response. The 
absence of evidence for induced expression (or repres- 
sion) of a characteristic set of genes in the Df/+ flies is 
sensu stricto evidence against a general aneuploid 
response. We therefore used a reference composed of 
median expression values for all Df/+ lines to more 
cleanly examine two-dose gene expression change 
caused by particular Df/+ genotypes. Subtle expression 
change among two-dose genes was extensive and het- 
erogeneous among the 21 lines (mean = 524.6 genes in 
females and 542.5 in males, or about 20 two-dose genes 
per one-dose gene; Additional file 4). 

If changes in two-dose genome expression were due to 
regulatory interactions, then there should be a non-random
set of changes that can be traced back to a causal
one-dose gene. We tested for such gene expression
network coherence and perturbation propagation by 
projecting our data onto the first sex-specific gene- 
expression network models for Drosophila. We con- 
structed these networks from the expression data gener- 
ated here, using mutual information, a quantity 
measuring the dependency between two variables, which 
has an important advantage over simple correlation 
methods, as it incorporates complicated nonlinear 
dependent relationships that better capture the relation- 
ships between complex genotypes and phenotypes [34] 
and relatedness in expression profiles [35]. Briefly, like 
many known biological networks, both our female and 
male networks (Additional files 5 and 6) exhibited scale- 
free properties; however, the power law exponents were 
different between sexes (-1.06 for female versus -1.35 
for male), indicating that the overall structure of the 
networks differed between females and males (Table 2).
Gene connectivity was higher in the female-specific network
(P < 0.01 by degree-preserving edge shuffling; this
method exchanges endpoints of edges under the restriction
that the resulting edges do not already exist in the network),
and the subnetwork of genes with female-biased expres- 
sion had significantly larger clustering coefficients than 
the subnetwork of genes with male-biased expression. 
Similarly, in the male network, genes with male-biased 
expression showed higher clustering than genes with 
female-biased expression. These differences in network 
structure were due only in part to sex-biased expression, 
as even among genes with non-sex-biased expression 
the clustering coefficients differed significantly in the 
female and male networks. Why the female and male 
mutual information models differ to this degree is not 
entirely clear, but for our purposes here, these two mod- 
els provide distinct and independent frameworks for 
examining the propagation of dosage effects. 
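
The degree-preserving edge shuffling mentioned above can be sketched as a sequence of endpoint swaps. This is a generic illustration of the technique on a toy undirected graph, not the implementation used in the study; the function names and the example graph are hypothetical.

```python
import random
from collections import Counter

def degrees(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize an undirected simple graph while keeping every node's degree."""
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:          # choose one of the two possible rewirings
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject self-loops and swaps that would create an edge that already exists.
        if a == d or c == b or new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list

# Toy graph: a 10-node ring plus two chords (illustrative only).
edges = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
shuffled = degree_preserving_shuffle(edges, 20, random.Random(0))
print(sorted(shuffled))
```

Repeating the shuffle many times and recomputing a statistic such as the clustering coefficient on each randomized graph gives a null distribution against which the observed network can be compared.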

After projecting the expression data onto the net- 
work models we observed patterns of extensive con- 
nection between one-dose and two-dose genes. 
However, due to the large number of possible paths 
through these connections, we explored the relation- 
ship between one-dose genes and the local network, by 
examining expression changes for the unique first- 
degree two-dose neighbors of every one-dose gene in 
the sex-specific mutual information networks. Addi- 
tionally, we also used a pre-existing network model 
that combines genetic interactions from Drosophila 
forward genetics, yeast two-hybrid data, and microar- 
ray expression datasets [36]. For all three networks, we 
calculated the probability of expression change in 
those unique first-degree neighbors among the differ- 
ent compensation categories (over-, partially, fully, 
non-, and anti-compensated). We found that one-dose 
genes with anti-compensated expression had two-dose 
first-degree neighbors with the highest probability of 
expression change (Figure 5a,b) irrespective of network 
model or sex. The propensity for first-degree neighbor 
change in expression networks centered on one-dose 
genes indicates that dosage effects are mediated, at 
least in part, by network interactions. 
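
The first-degree neighbor analysis can be illustrated with a small sketch. It assumes an adjacency map, a compensation class for each one-dose gene, and a set of genes called as changed; "unique" is interpreted here as counting each two-dose neighbor once per class, which is an assumption, and all gene names are placeholders.

```python
from collections import defaultdict

def neighbor_change_probability(adjacency, one_dose_class, changed):
    """Fraction of two-dose first-degree neighbors with changed expression, per class."""
    neighbors_by_class = defaultdict(set)
    for gene, cls in one_dose_class.items():
        for nb in adjacency.get(gene, ()):
            if nb not in one_dose_class:          # keep two-dose neighbors only
                neighbors_by_class[cls].add(nb)   # a set counts each neighbor once
    return {cls: len(nbs & changed) / len(nbs)
            for cls, nbs in neighbors_by_class.items() if nbs}

# Toy network with placeholder gene names (not genes from this study).
adjacency = {"A": {"x", "y", "B"}, "B": {"A", "z", "w"}}
one_dose_class = {"A": "anti-compensated", "B": "fully compensated"}
changed = {"x", "y", "z"}

probabilities = neighbor_change_probability(adjacency, one_dose_class, changed)
for cls, p in sorted(probabilities.items()):
    print(f"{cls}: {p:.2f}")
```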

Outside of the anti-compensated class, the behavior of 
first-degree two-dose neighbors differed by sex and
network. In females, there was significant stabilization of
first-degree neighbor expression surrounding genes with 
partial or full compensation. These results are consistent 
with perturbation spreading, followed by robust resis- 
tance to expression change among neighbors of partially 
and fully compensated genes. In males, however, expres- 
sion of first-degree neighbors approached the global 
average, with the notable exception of the anti-compen- 
sation class. The reason for this sex-difference is 
unclear. 

Simple network interactions among first-degree neigh- 
bors centered on one-dose genes suggest that a given 
one-dose gene directly regulates some neighboring two- 
dose genes. Given that the one-dose gene is the cause of
the perturbation, we can determine whether propagation
is due to positive or negative interactions in the
gene pair. Globally, two-dose genes showed little bias in 
the direction of expression difference (Figure 5c,d). We 
therefore looked for skewing in the direction of two- 
dose gene responses among first-degree neighbors of 
one-dose genes. We observed a strong preference (P <
0.01, Chi-square test) for lower expression in the first-
degree neighbors of anti- or non-compensated genes 
and for increased expression in the first-degree neigh- 
bors of partially, fully, and over-compensated genes (Fig- 
ure 5c, d). An example of such a first-degree neighbor 
map centered on a one-dose gene is shown in Figure 5e. 
The non-random nature of first-degree neighbor change 
directionality strongly suggests that there was informa- 
tion flow between the one-dose genes and the surround- 
ing two-dose genes. These relationships were dominated 
by sympathetic responses, suggesting positive regulation. 
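
The directionality test can be illustrated with a 2 x 2 contingency table. The counts below are invented for the example, and the way neighbors are grouped into rows is an assumption about how such a test might be set up rather than a description of the study's exact procedure.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table of counts (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts of first-degree neighbors (columns: lower, higher expression).
table = [
    [30, 10],   # neighbors of anti- or non-compensated one-dose genes
    [10, 30],   # neighbors of partially, fully, or over-compensated one-dose genes
]
stat = chi_square_2x2(table)
print(f"chi-square = {stat:.1f}")   # compare with 6.635, the P = 0.01 cutoff for 1 df
```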

There was changed expression of two-dose genes 
beyond what we could unambiguously trace through the 
networks. We asked if these changes in expression were 
coherent by focusing on genes encoding members of 
protein complexes. The Drosophila Protein interaction 
Map (DPiM) is a Drosophila protein-protein interaction 
model for protein complexes based on co-affinity purifi- 
cation followed by mass spectrometry [37]. We exam- 
ined expression changes in the Df/+ lines for two-copy 
genes encoding members of 23 high-confidence multi- 
subunit complex models from DPiM to determine if 
changed expression in one member was associated with 
an enriched chance for expression change in another
gene encoding a complex member (Figure 6). Of the 966
cells in the matrix (23 complexes x 21 lines), we
observed significant co-expression change in 37 cases (P
< 0.01, hypergeometric test). These data suggest that
expression changes in the one-dose region of the genome
preferentially affect two-dose genes encoding members
of the same protein complexes. This is strong
evidence that one-copy genes result in coherent expression
perturbation in the two-dose genome beyond the
first-degree neighbors.

Table 2 Topological statistics for female and male mutual information networks

                         Female network                     Male network
Expression bias          Female   Male    None    All       Female   Male    None    All
Number of nodes          3,456    789     1,381   5,933     2,922    2,639   1,978   8,005
Clustering coefficient   0.44     0.30    0.36    0.44      0.12     0.31    0.19    0.23
Average neighbors        76.66    8.29    18.45   89.82     7.73     53.78   9.13    43.72
Density                  0.022    0.01    0.013   0.015     0.003    0.02    0.005   0.005
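
The hypergeometric test applied to protein complexes can be sketched as an upper-tail probability. The mapping of variables is an assumption made for illustration: N two-dose genes measured in a line, K of them changed, and a complex of n members of which k changed. The numbers are invented.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N, of which K changed."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Hypothetical case: 500 of 10,000 two-dose genes changed in one Df/+ line,
# and 4 of the 8 members of one complex are among them.
p_value = hypergeom_upper_tail(k=4, N=10_000, K=500, n=8)
print(f"P(at least 4 of 8 members changed) = {p_value:.2e}")
```

With a background change rate of 5%, fewer than one changed member is expected in a complex of eight, so four changed members gives a small tail probability.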

Discussion 

The relationship between DNA dose and gene expres- 
sion in Drosophila is poorly understood, but gene pro- 
duct balance is clearly important [7,38]. For example, 
assembly of multi-subunit complexes such as ribosomes 
is highly sensitive to changes in the dose of constituent
proteins. Additive effects of massive gene dose deviations
are incompatible with life in Drosophila and, given
a sensitive assay, the gene doses of single modifiers have 
pronounced phenotypic effects. It follows that there 
should be a response to gene dose beyond the genes 
with reduced dose and some of those changes will 
involve feedback onto the genes with altered dose. Stu- 
dies of Drosophila structural variants have shown partial 
dosage compensation of autosomal genes when measured
across the population of genes with altered dose [8-10,14].
This partial compensation could be the result of uni- 
form compensation of all genes, or heterogeneous 
responses with a characteristic mean. 

Network properties contribute to dosage compensation 

Briefly, we have shown that one-dose genes show indivi- 
dual expression responses to reduction in dose. While 
we cannot rule out some contribution of a stereotypic 
aneuploid response resulting in partial compensation, 
the overall response is highly heterogeneous. In humans, 
it also appears that individual genes can show very good 
(rare), or negligible (common), compensation, suggesting 
that gene-by-gene compensation is not restricted to 
Drosophila [39]. Inverse effects, where gene expression 
is anti-correlated with gene dose, are well-known in 
maize [40], and we see rare over-compensation resulting 
in higher expression when dose is reduced in Droso- 
phila. We suggest that the compensation of one-dose 
genes occurs in the context of the surrounding gene 
expression network. Similar effects occur in mice, where 
some one-dose genes show compensation during speci- 
fic temporal windows, strongly suggesting feedback reg- 
ulation [41]. We also demonstrate that perturbation 
coherently propagates from one-dose genes into the net- 
work. This indicates that autosomal dosage compensa- 
tion and the consequences of the absence of dosage 
compensation are, at least in part, network properties. 
The differences in compensation in these diverse systems
may be due to the relative proportions of compensation
classes among genes, and/or network architecture, rather
than a gross difference in gene behavior between 
organisms. 

We showed that the gene-by-gene response to dose 
depends on two related factors: gene expression level 
and network context. Compensation is poorer for highly 
expressed genes, and indeed the relationship between 
compensation and expression shows some hints of con- 
verging on non-compensation at high expression levels. 
Our results are consistent with the observation that tis- 
sue-specifically expressed genes were better compen- 
sated in Drosophila compared to ubiquitously expressed 
genes [8], as non-ubiquitously expressed genes show 
lower expression in whole animal samples. Additionally, 
our data at high expression levels are consistent with the 
response in yeast, where highly expressed genes show 
no dosage compensation at the protein level while a 
minority are perfectly compensated [42]. Our results dif- 
fer from another recent report that highly expressed 
Drosophila genes were better compensated [14]. Because 
of the highly heterogeneous gene-specific response to 
dose, it is quite possible that these differences in conclu- 
sions are due to the particular regions of the genome 
examined. While we do not understand why compensa- 
tion depends on steady-state expression level, it is possi- 
ble that better compensation of poorly expressed genes 
is due to robust control of expression where low abun- 
dance increases deviations due to stochastic noise. As 
an extreme example, a transcript present at, or below, 
one copy per cell must result in wild swings in fold rela- 
tionships to other transcripts, and might be an excellent 
candidate for compensation. Poor compensation of
highly expressed genes may be due to 'speed limits'
imposed by maximal rates of transcription for a particu- 
lar arrangement of regulatory sequences at that locus. 

Our study demonstrates that there are coherent pat- 
terns of expression change in potentially co-regulated 
complexes and immediate neighbors of one-dose genes. 
It seems likely that both kinetics and active regulation 
are components of network-mediated dosage compensation
and propagation. Transcription is an enzymatic
process, and changes in enzyme concentration in path- 
ways are buffered by substrate and reactant concentra- 
tions that depend on other enzymes in the pathway 
[20], such that flux varies between no change and 
change of the same magnitude as the dose change. 
However, buffering does not explain the anti- and over- 
compensation we observed, suggesting that active regu- 
lation via feedback is also a component of dosage com- 
pensation. Buffering and feedback are not mutually 
exclusive. For example, the yeast galactose network 
(involving GAL2, GAL3, GAL4 and GAL80) is robust to 
gene dose alterations through a simple two-component 
system with at least one inhibitor and one activator reg- 
ulating the pathway. However, activator-inhibitor inter- 
actions and stoichiometry in the galactose network have 
profound effects on the robustness of the network [43]. 

Our work suggests that the anti-compensated genes 
might result in the most damage to the rest of the gene 
expression network, or minimally, that the damage is 
more easily traced into the expression network in our 
models. These dose effects indicate that anti-compen- 
sated genes are weakly haploinsufficient and are good 
candidates for pathological variants. At least in females, 
the over-compensated genes also appear to result in 
clear propagation of perturbation and are likely to be 
damaging to the expression network. Another female- 
restricted response suggests that genes with partial or 
full dosage compensation increase the robustness of the 
local expression network centered on those genes. That 
males show different propagation patterns could be due 
to inherent differences between females and males or 
differences in network model quality. Despite these differences,
in both females and males we observed sympathetic
changes in expression of first-degree neighbors of
one-dose genes. These data suggest that most causal 
relationships identified are positive, despite the fact that 
mutual information networks identify both correlated 
and anti-correlated relationships. 

What about sex chromosomes? 

While we do not examine X chromosome dosage compensation
in this manuscript, our findings do have some
implications for some of the models for sex chromosome
dosage compensation. The majority of studies suggest
that the male X chromosome is upregulated to
achieve compensation in Drosophila [15]. However, it
has also been suggested that interaction between the
autosomes and the X chromosome contributes to X
chromosome dosage compensation by lowering autosomal
expression in males [38]. The relationship between
the non-compensated genes and first-degree neighbor 
expression we observe here is sympathetic. If this is also 
true for X chromosome genes, then one effect of non-compensation
of X-linked genes would be to lower
expression of first-degree neighbors encoded on autosomes.
Therefore, X chromosome-autosome interactions
might act to partially balance gene expression by lowering
autosomal expression. While we have previously
reported that models calling for up-regulation of the X
in Drosophila males explain more of dosage compensation
than possible network interactions with the autosomes
[10], it is quite possible that such interactions
exist. It is perhaps even more likely that these interactions
existed and shaped dosage responses during the
evolution of the X and Y chromosomes from an ancestral
autosome pair [44]. As genes are lost from neo-Y
chromosomes there is a gradual crisis that is not effectively
controlled by chromosome-wide mechanisms until
Y-chromosome gene loss is extreme (evolutionarily premature
dosage compensation would make males functionally
triploid for genes present on the X and Y). Sex
chromosome-wide mechanisms that have evolved also
create imbalances. Network interactions between the X
and autosomes could also contribute to the equilibration
of X and autosome expression in XY male mammals and
in XX females following X inactivation [18]. Similarly,
network effects might also help explain dosage compen- 
sation in the absence of MSL in the early Drosophila 
XY male soma and mitotic germline [9,17,18]. 

Our results also point out the complications of char- 
acterizing sex chromosome dosage compensation in the 
absence of a baseline value for autosomal compensation. 
For example, in light of our findings, it is unclear if par- 
tial sex chromosome compensation in birds [45] is due 
to a generic response to monosomy or a chromosome- 
specific compensation mechanism with limited efficacy. 
On the other hand, possible sex chromosome heteroge- 
neity in baseline compensation in the absence of a chro- 
mosome-wide mechanism also cautions against using 
global expression values to make broad statements 
about sex chromosome dosage compensation. Specifically,
it has been proposed that intermediate X chromosome
compensation in the wild-type Drosophila male
germline is due to a fixed fold effect of failed X chromosome
dosage compensation [46], rather than complications
due to measuring expression in mixes of cells
showing dosage compensation, sex-biased gene content,
and the precocious X chromosome inactivation that
occurs in male germ cells ranging from Caenorhabditis
elegans to human [18,47]. The clearest conclusion for
the study of sex chromosome compensation is that one 
should not assume that the two-fold difference in gene 
dose is easily corrected by a fixed-fold dosage compen- 
sation system, as the baseline expression for 'non-com- 
pensated' sex chromosome genes may well differ among 
sex-linked genes. The study of sex chromosome dosage 
compensation will need to be coupled with studies of 
dosage compensation elsewhere in the genome. 

Building better network models 

Our network modeling shows a common thematic con- 
nection between one-dose genes and the rest of the gen- 
ome, but these models are far from complete and differ, 
for example, in the specific genes we called first-degree 
neighbors. Systematic subtle perturbation using gene 
dose is a good tool for generating better network mod- 
els. Specifically, since we can trace propagating changes 
in engineered Drosophila where the causal dose change 
is known, we can move beyond connectivity to informa- 
tion flow within current network models and use these 
data iteratively to build better models. For example, the 
expression values for a given gene pair connected by an 
edge are the result of one gene regulating the other 
(directly or indirectly) or both genes being co-regulated 
by a common first-degree neighbor. In a positive inter- 
action, the predicted response to an instantaneous gene 
dose reduction is reduced expression of directly regu- 
lated neighbors; however, if a third gene responds to 
one-dose expression by increasing the expression of this 
co-regulated pair, then the first-degree neighbor of the
one-dose gene should be over-expressed. Indeed, we 
observed that partially compensated genes were 
enriched for over-expressed first-degree neighbors in 
both sexes. A larger data set, where each node in a sub- 
network is one-dose in one experiment and two-dose in 
the others, should allow us to unambiguously determine 
if relationships are directional and, if so, whether the 
effect is positive or negative. With better models, we 
should be able to predict information flow, and perhaps 
dose-dependent genetic interactions resulting in oligo- 
genic phenotypes. Finally, if we can establish a basic 
understanding of gene dose responses in Drosophila, we 
may be able to apply basic rules to copy number varia- 
tions associated with human disease, which also appear 
to be mediated by network responses [48]. 

Materials and methods 

Flies and samples 

We obtained flies from the Drosophila stock center 
(Bloomington, IN, USA). We crossed DrosDel males to
virgin w1118 females to remove balancer chromosomes.
We determined that the line initially labeled Df(2L)ED748
had the breakpoints reported for Df(2L)ED478,
and changed the labeling in this report accordingly. This
line also carried an additional 2L deletion (Figure 1).
Flies were grown under constant temperature and 
humidity (25°C; 60% relative humidity) on San Diego 
Stock Center cornmeal media [49]. We pooled 50 to 60 
sexed adults (5 days post-eclosion) for RNA extraction 
for each of 3 to 4 replicate preparations. Total RNA was 
extracted using TRIzol® (Invitrogen, Carlsbad, CA, 
USA) and poly A+ mRNA was enriched using Oligotex 
(Qiagen, Valencia, CA, USA) following the manufac- 
turer's instructions. mRNA quality was scored by the 
presence of tight rRNA bands in Bioanalyzer profiles 
(Agilent, Santa Clara, CA, USA). We extracted DNA 
using the LiCl method [50] and quantified on a Nano- 
drop (Thermo Fisher, Wilmington, DE, USA). 

Arrays and sequencing 

All array and sequence data are available from the Gene 
Expression Omnibus (GEO) [51]. See GEO GPL8593 for 
array platform details and GEO GSE31407 for complete 
methods and supplemental information. We used a 12-plex
60-mer probe microarray, 080523_D_melanogaster_5.5_expr
(Roche Nimblegen, Madison, WI, USA),
and performed experiments in at least biological tripli- 
cates as described [10] in a chamber with air passed 
through NoZone ozone scrubbers (SciGene, Sunnyvale, 
CA, USA). All array data were in log2 scale. We normal- 
ized all microarray data triplicates with rank correlation 
>0.8 using Robust Multi-Chip Averaging [52] to produce 
a gene level metric of expression. Two sample hybridiza- 
tions failed to meet this threshold and were not further 
considered. We then set the threshold of detected 
expression at 2 SD above mean hybridization intensity 
to control probes. We demanded that a given gene 
show within-sex expression above this threshold in all 
tested lines. We used two types of references in the 
manuscript. When testing for a global effect of aneuploidy,
we used median expression of the w1118 line as a
denominator. When we were testing for the effect of
particular deficiencies, expression ratios were comparisons
to a composite reference built from the median
expression values from all experiments. Expression differences
were called by false discovery rate-corrected (P
< 0.05) moderated t-tests [53,54].
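
The detection filter and the composite-reference ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the paper: the function names are hypothetical, the sample standard deviation is assumed for the control probes, and the input values are invented log2 intensities.

```python
import statistics

def detection_threshold(control_log2, n_sd=2.0):
    """Mean control-probe intensity plus n_sd standard deviations.

    Assumption: the sample SD is used; the text does not say which SD.
    """
    return statistics.mean(control_log2) + n_sd * statistics.stdev(control_log2)

def detected_in_all_lines(gene_log2_by_line, threshold):
    """Keep a gene only if it exceeds the threshold in every tested line."""
    return all(value > threshold for value in gene_log2_by_line.values())

def log2_ratio_to_reference(gene_log2_by_line, line):
    """Log2 ratio of one line to the median over all lines.

    The median over all lines stands in for the composite reference.
    """
    reference = statistics.median(gene_log2_by_line.values())
    return gene_log2_by_line[line] - reference

# Invented example values (log2 scale); line names are placeholders.
controls = [6.0, 6.5, 7.0, 6.5]
gene = {"line_A": 10.0, "line_B": 9.0, "line_C": 10.0}

threshold = detection_threshold(controls)
if detected_in_all_lines(gene, threshold):
    print(log2_ratio_to_reference(gene, "line_B"))
```

Because the data are in log2 scale, the ratio to the reference is a simple difference, and a value of -1 corresponds to a two-fold reduction.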

For RNA-Seq, 100 to 200 ng of poly-A+ mRNA from
samples along with external spike-in control libraries
were prepared for sequencing on a GAII (Illumina, San
Diego, CA, USA) or SOLiD 4 (Life Technologies,
Carlsbad, CA, USA). We used 8 [55] external control
RNAs for Illumina RNA-Seq and 96 [31] ERCC external
control RNAs for SOLiD RNA-Seq. For the ERCC
controls, the mix 1:mix 2 ratios contain three subsets of 32
RNAs at 1:1, 1.5:1, and 1:1.5, with a dynamic concentration
range of 2 . Mix 1 was added to wild-type
mRNA and mix 2 was added to Df/+ mRNAs. For Illumina
runs, we used 36 bp reads that passed default
parameters (Chastity >0.6; Illumina). For SOLiD runs,
we used only the forward read and trimmed these
reads from 50 to 36 bp based on analysis of read quality
and to make data comparable to Illumina data. For
DNA-Seq, 5 μg of DNA was prepared as described
[10] and sequenced on a GAII or HiSeq 2000 (Illumina)
as outlined for RNA-Seq.

We used the dm3 Drosophila melanogaster sequence 
build [56] from the UCSC Genome Browser [57] as a 
reference (excluding Uextra) for alignment using Bowtie 
v.0.12.7 settings -v 2 -m 1 [58] and FlyBase r5.29 for 
annotations [59]. We quantified expression using HTSeq 
union mode [60], and used the unique mapping reads to 
calculate reads per kb per million mapped (RPKM) as 
the normalized metric of gene expression. We identified 
the novel Df breakpoint with rSW-Seq [61] and determined
fold-difference for aneuploid segments with samtools
[62], which we expressed as reads per million
(RPM). All but the smallest 5 kb deletion were detected
using this method. We did not attempt to measure sin- 
gle nucleotide polymorphisms. We visualized expression 
data with Bioconductor tools [63] or MatLab (Math- 
works, Natick, MA, USA), and DNA-Seq coverage with 
Bedtools [64] and the UCSC Browser. 
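
The two normalized metrics named above follow the standard definitions of RPKM and RPM. The sketch below is only an illustration: the function names and the read counts are hypothetical, and obtaining the counts themselves (alignment and counting) is outside its scope.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads (used here for DNA-Seq segments)."""
    return read_count * 1e6 / total_mapped_reads

def fold_difference(rpm_sample, rpm_reference):
    """Ratio of segment coverage in a deficiency line to the reference."""
    return rpm_sample / rpm_reference

# Invented counts: 500 unique reads on a 2 kb gene, 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))

# A one-dose segment is expected to show roughly half the reference coverage.
print(fold_difference(rpm(50, 10_000_000), rpm(100, 10_000_000)))
```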

Network analysis 

We used a two-step procedure to classify the expression 
of one-dose genes into five groups: anti-compensated, 
non-compensated, fully compensated, partially compen- 
sated, and over-compensated. First, using a moderated 
t-test, we tested the null hypothesis that the expression 
of one-copy genes was reduced by half compared to the 
DrosDel reference values for each gene. We rejected the 
null hypothesis for all genes with P < 0.05 (limma pack- 
age from Bioconductor [63] with false discovery rate by 
Benjamini-Hochberg correction [54]). The genes for 
which the null hypothesis was not rejected were classi- 
fied as non-compensated. We classified genes for which 
the null hypothesis was rejected and expression was 
lower than the expected two-fold reduction as anti-com- 
pensated. Next, the genes for which the null hypothesis 
was rejected and the expression was higher than the 
expected two-fold reduction (199 genes in females and
242 genes in males) were stratified into compensation
levels - fully compensated, partially compensated and
over-compensated - using cutoff values defined as follows.
To set appropriate cutoff values, we first estimated
the distribution of log fold change in this group by sampling
1,000 times with replacement, and subsequently
computed normal distribution-based cutoffs for the
2.5% and 97.5% quantiles. These cutoffs were then adjusted
by subtracting the sampled population mean, thus
centering the confidence interval at 'no change' relative
to the reference. Finally, the genes with mean log-fold
expression change between the cutoffs were classified as 
fully compensated, the ones above the upper cutoff as 
over-compensated and ones below the lower cutoff as 
partially compensated. 
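
The two-step classification can be sketched as below. This is one possible reading of the procedure, not the code used for the paper. The moderated t-test itself (limma) is not reimplemented, so the corrected P value is taken as an input. The resampling step assumes that the resampled values are pooled and that the cutoffs are the mean plus or minus 1.96 SD of that pool, re-centered on zero. All function and parameter names are hypothetical.

```python
import random
import statistics

Z_975 = 1.959964  # 97.5% quantile of the standard normal distribution

def bootstrap_cutoffs(log2fc, n_boot=1000, seed=0):
    """Normal-based 2.5% and 97.5% cutoffs, centered at 'no change'.

    Assumption: values are resampled with replacement n_boot times,
    pooled, and summarized by their mean and SD.
    """
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_boot):
        pooled.extend(rng.choices(log2fc, k=len(log2fc)))
    mean = statistics.fmean(pooled)
    sd = statistics.pstdev(pooled)
    lower = (mean - Z_975 * sd) - mean  # subtracting the mean centers at 0
    upper = (mean + Z_975 * sd) - mean
    return lower, upper

def classify(log2fc, adj_p, lower, upper, alpha=0.05, expected=-1.0):
    """Assign one of the five classes to a one-dose gene.

    adj_p tests the null hypothesis of a two-fold reduction
    (log2 fold change equal to `expected`).
    """
    if adj_p >= alpha:
        return "non-compensated"        # null hypothesis not rejected
    if log2fc < expected:
        return "anti-compensated"       # lower than a two-fold reduction
    if log2fc < lower:
        return "partially compensated"
    if log2fc > upper:
        return "over-compensated"
    return "fully compensated"

# Invented log2 fold changes for genes above the two-fold expectation.
lower, upper = bootstrap_cutoffs([-0.4, -0.1, 0.0, 0.2, 0.3, -0.2])
print(classify(-0.6, 0.01, lower, upper))
```

A fixed seed is used only so that the sketch is reproducible; the source does not describe how randomness was handled.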

We used our gene expression data, standardized for each
gene (mean expression across lines and replicates subtracted,
then divided by the SD), as a variable and estimated mutual information (MI) for
all possible pairs of genes by a kernel method [65] to construct
the sex-specific models (kernel width = 0.3; edges
with MI P < 0.005). Unlike simple correlation, MI tests
non-linear relationships and does not require that the 
distribution of variables is normal. In addition, MI net- 
works have been shown to perform well on simulated 
data and to be more resilient to estimation errors [66]. 
We used other networks as described by the creators 
[36,37]. Twenty-three protein complexes enriched for
ontology terms (P < 0.005) and having ten or more members
were selected directly from the DPiM network without
further processing [36,37], and we used the
hypergeometric test for significance of expression change. 
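
To make the standardization and the kernel-based MI estimate concrete, the sketch below uses a Gaussian kernel density estimator of mutual information. Whether this matches the estimator of [65] is an assumption. The function names are hypothetical, the data are invented, and the significance step (MI P < 0.005) is not shown.

```python
import math
import statistics

def standardize(values):
    """Subtract the mean and divide by the SD (sample SD assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def kernel_mi(x, y, width=0.3):
    """Gaussian-kernel estimate of mutual information between x and y.

    For each sample i, the joint density is compared to the product of
    the marginal densities; kernel normalization constants cancel.
    """
    n = len(x)

    def k(a, b):
        return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))

    total = 0.0
    for i in range(n):
        kx = [k(x[i], x[j]) for j in range(n)]
        ky = [k(y[i], y[j]) for j in range(n)]
        joint = sum(a * b for a, b in zip(kx, ky))
        total += math.log(n * joint / (sum(kx) * sum(ky)))
    return total / n

# Invented expression profiles across 20 samples.
gene_a = standardize([float(i) for i in range(20)])
gene_b = standardize([float((7 * i) % 20) for i in range(20)])

print(kernel_mi(gene_a, gene_a))  # a gene compared with itself
print(kernel_mi(gene_a, gene_b))  # a scrambled partner
```

The estimate is symmetric in its two arguments and needs no assumption of normality, which is the property the text contrasts with simple correlation.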

We visualized networks in Cytoscape v.2.8 and used 
the Network Analysis and Random Network plug-ins to 
fit power-law models, generate randomized networks 
and generate descriptive statistics for the female and 
male networks [67,68]. We used the one-versus-everyone
approach [69] to identify significant changes in first-degree
neighbor expression, and we calculated
the global probability of changed expression from all
nodes in each network and compared it to the observed
value by chi-square test. Probabilities are indicated in the main
text. We performed network statistics and analysis in 
MatLab (Mathworks). 
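
The comparison of observed neighbor changes with the global probability can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. This is a sketch under assumptions: the exact contingency layout used in the analysis is not given in the text, the function name is hypothetical, and the counts are invented.

```python
import math

def chi_square_vs_global(n_neighbors, n_changed, global_prob):
    """Chi-square test (1 df) of changed/unchanged neighbor counts.

    Expected counts come from the network-wide probability of change.
    Returns the statistic and its P value.
    """
    expected_changed = n_neighbors * global_prob
    expected_unchanged = n_neighbors * (1.0 - global_prob)
    observed_unchanged = n_neighbors - n_changed
    chi2 = ((n_changed - expected_changed) ** 2 / expected_changed
            + (observed_unchanged - expected_unchanged) ** 2 / expected_unchanged)
    # For 1 degree of freedom the survival function is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Invented example: 30 of 100 first-degree neighbors changed,
# against a global probability of change of 0.2.
chi2, p_value = chi_square_vs_global(100, 30, 0.2)
print(chi2, p_value)
```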